Mon. Not. R. Astron. Soc. 000, 1-?? (0000) Printed 1 February 2008 (MN LaTeX style file v1.4)



Scaling of voids and fractality in the galaxy distribution 

Jose Gaite and Susanna C. Manrubia 

Centro de Astrobiología, CSIC-INTA, Ctra. de Ajalvir Km. 4, 28850 Torrejón de Ardoz, Madrid, Spain.



Accepted 0000 December 00. Received 0000 December 00; in original form 0000 October 00 






ABSTRACT 

We study here, from first principles, what properties of voids are to be expected in a 
fractal point distribution and how the void distribution is related to its morphology. 
We show this relation in various examples and apply our results to the distribution of 
galaxies. If the distribution of galaxies forms a fractal set, then this property results 
in a number of scaling laws to be fulfilled by voids. Consider a fractal set of dimension 
D and its set of voids. If voids are ordered according to decreasing sizes (largest void 
has rank R = 1, second largest R = 2 and so on), then a relation between size A and
rank of the form $A(R) \propto R^{-z}$ must hold, with $z = d/D$, where d is the euclidean
dimension of the space where the fractal is embedded. The physical restriction D < d
means that z > 1 in a fractal set. The average size $\bar{A}$ of voids depends on the upper
($A_u$) and the lower ($A_l$) cut-off as $\bar{A} \propto A_u^{1-D/d} A_l^{D/d}$. Current analyses of void sizes
in the galaxy distribution do not show evidence of a fractal distribution, but are 
insufficient to rule it out. We identify possible shortcomings of current void searching 
algorithms, such as changes of shape in voids at different scales or merging of voids, 
and propose modifications useful to test fractality in the galaxy distribution. 



Key words: cosmology: large-scale structure of the universe - galaxies: clusters:
general - methods: statistical



1 INTRODUCTION 

The morphological properties of the distribution of galaxies 
are commonly analyzed by means of the correlation func- 
tions of this distribution, chiefly, the two-point correlation 
function (Peebles, 1980). This correlation function is well fit- 
ted by a power law up to some scale (Peebles, 1980). Within 
this range, the coarse-grained galaxy density exhibits large 
fluctuations associated with various structures, namely, galaxy
clusters and superclusters of diverse forms, and voids. Most 
studies of galaxy structure have focused on clusters and su- 
perclusters, but the presence of large voids was noted long 
ago and the size of the largest voids detected has steadily 
grown (Einasto, Joeveer & Saar, 1980). The analysis of 
voids is a subject of current interest in cosmology (Einasto, 
Einasto & Gramann, 1989; Vogeley, Geller & Huchra, 1991; 
El-Ad, Piran & da Costa, 1997; Müller et al., 2000; Hoyle &
Vogeley, 2002). 

On the one hand, the analysis of correlation functions 
and the hierarchical structure of clusters and superclus- 
ters provides evidence for a self-similar fractal structure 
(Coleman & Pietronero, 1992; Sylos Labini, Montuori & 
Pietronero, 1998), although the scale of transition to a homogeneous
universe is still a matter of debate (Guzzo, 1997;
Wu, Lahav & Rees, 1999; Chown, 1999; Martinez, 1999). 
On the other hand, the scaling properties of voids are much 
less studied, but scaling of certain quantities has been put
forward as an indication of self-similarity (Einasto et al.,
1989).

Here, we begin by studying the void properties of fractal 
distributions in general. Self-similarity is the most obvious 
property and is related to the fractal dimension but there 
are other properties worth considering, such as lacunarity 
(Mandelbrot, 1977), which we define in Section 2. We show 
in examples how to perform a void analysis to obtain the 
fractal dimension and other morphological properties. Then 
we proceed to compare with current void analyses of galaxy 
catalogues, pointing out their relation with our method and, 
according to this method, the conclusions that can be ex- 
tracted from these catalogues. 

Typically, voids are extracted from galaxy catalogues 
by using some void detection algorithm. These algorithms 
provide us with a list of voids, ranked by decreasing size. 
Therefore, these lists are suitable for rank-ordering tech- 
niques common in statistics (Zipf, 1949; Sornette, 2000). In 
particular, the Zipf law, that is, a rank-ordering power law, 
is often indicative of fractal behaviour. In our case, a power 
law cumulative distribution of voids is expected for a geo- 
metric fractal (Mandelbrot, 1977). This cumulative distribu- 
tion corresponds to a rank-ordering Zipf law (with different 
exponent). 

We shall begin by studying the rank-ordering of voids in
simple geometric fractals, namely, Cantor sets, where the
Zipf law can be easily proved. Furthermore, we will consider
some suitable two-dimensional fractals, where the problem 
of definition of voids arises. We will extrapolate the results to 
three-dimensional fractals. In all cases, we will refer to pure 
fractals, which can be characterized by a single exponent, 
that is, their fractal dimension. We will not discuss the more 
complex case of multifractal sets, where a whole spectrum of 
singularity exponents is required to accurately describe their 
geometrical properties (Falconer, 1990). Finally, we apply 
the previous results to current galaxy void catalogues. 






2 ZIPF'S LAW AND FRACTALITY IN 
CANTOR SETS 

Consider a set of quantities $\{A_k\}$ corresponding to k = 1, 2, ...
measures of a phenomenon. It is common to measure the probability
distribution p(A), that is, the probability of finding an event of size A,
in order to quantify the statistical properties of the system. An
alternative way to carry out a similar quantification is provided by the
rank-ordering technique, introduced by Zipf (Zipf, 1949). This procedure
highlights the properties of the large values of A: the largest value is
assigned rank R = 1, the second largest has R = 2, and so on. The function
A(R) conveys information equivalent to p(A). In particular, if
$A(R) \propto R^{-z}$, then $p(A) \propto A^{-\alpha}$, with
$\alpha = 1 + 1/z$. This relation can be explained as follows. Note that
p(A) is the fraction of voids with size A. Hence, the total number of voids
with size larger than or equal to A [which corresponds to the function R(A)]
is proportional to the accumulated distribution of p(A), that is,

$$R(A) \propto \int_A^{\infty} p(A')\,dA'. \qquad (1)$$

If $p(A) \propto A^{-\alpha}$, direct integration returns
$R(A) \propto A^{1-\alpha}$. Inverting it, we obtain
$A(R) \propto R^{-1/(\alpha-1)}$, that is, $z = 1/(\alpha-1)$, which is the
reported relation between z and $\alpha$.
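The relation between the two exponents can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the sample is synthetic (drawn with the standard-library `random.paretovariate`, whose density is a pure power law), and the helper names `alpha_from_z`, `loglog_slope` and `rank_exponent` are ours, not part of any void catalogue analysis.

```python
import math
import random


def alpha_from_z(z):
    """Exponent of p(A) ~ A**(-alpha) implied by a rank law A(R) ~ R**(-z)."""
    return 1.0 + 1.0 / z


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


def rank_exponent(sizes):
    """Estimate z by rank-ordering a sample (the largest size has R = 1)."""
    ordered = sorted(sizes, reverse=True)
    ranks = range(1, len(ordered) + 1)
    return -loglog_slope(ranks, ordered)


# Synthetic sample (an assumption for illustration, not data from the paper):
# random.paretovariate(s) has density ~ x**(-(s + 1)), so alpha = s + 1
# and the expected rank exponent is z = 1/s.
random.seed(1)
z_true = 1.5
sample = [random.paretovariate(1.0 / z_true) for _ in range(20000)]
z_fit = rank_exponent(sample)
print("alpha =", alpha_from_z(z_true), " fitted z =", round(z_fit, 2))
```

The fitted value should lie close to the input z, up to sampling noise; an unweighted fit over all ranks is a simple choice made here for brevity, not a recommendation.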

In order to illustrate the relation between Zipf's law for
void sizes and the geometrical properties of a matter distri- 
bution, we begin with Cantor-like sets defined in the unit 
interval. If we restrict to deterministic fractals, a number of 
relevant quantities can be exactly calculated and clearly put 
in correspondence with each other. 

A deterministic Cantor set is generated by an iterative 
procedure. Its generator is characterized by three indepen- 
dent quantities. First, r < 1 is the scaling factor. Usually, 
the unit interval is divided into 1/r pieces of equal length. Of 
these, N intervals remain for the process to be repeated and 
(1/r - N) are eliminated. These two quantities completely
define the fractal dimension of the asymptotic set, namely 
D = -log N/ log r (Mandelbrot, 1977). Nonetheless, there 
are different ways in which the intervals can be chosen 
(in particular, some of them could be adjacent). Therefore, 
there is still a degree of freedom which translates into a vari- 
able number m < N of voids (or gaps) in the generator. The 
more adjacent intervals, the fewer gaps and smaller m. 

Figure 1. Three generators for Cantor-like fractals in the unit
interval and the first iteration of the algorithm. We show the
generator and the scaling rule producing the fractal set: N repetitions
of the scaled set are used to construct the next iteration.
The three examples shown have a variable number of gaps in the
generator: (a) m = 4, (b) m = 2, (c) m = 1; other parameters
are shared: N = 5, r = 1/9. Hence, these three fractals have the
same fractal dimension D = log 5 / log 9 but different lacunarity.

The effect of the parameter m on the morphology of the
fractal set is quantified through an appropriate measure of
lacunarity, that is, the quality of having large voids (for a
given sample size). Figure 1 shows three examples of generators
and the first iteration for sets with N = 5, r = 1/9
(hence with the same fractal dimension) and different m.
The classic triadic Cantor set has N = 2, r = 1/3, and
m = 1. Finally, note that the average length $c(r,N,m) \geq 1$
(in units of r) of the m initial voids can be obtained from
the relation

$$c\,m\,r = 1 - r\,N, \qquad (2)$$

where the function c(r, N, m) gives an estimation of the degree
of "adjacency" of voids of size r. In the following, we
write only c for simplicity.
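As a small worked example, the generator quantities can be evaluated for the three generators of Fig. 1. This is a minimal sketch: it assumes only that the m gaps fill the length left over by the N retained intervals of length r, and the function names `cantor_dimension` and `mean_gap_length` are ours.

```python
import math
from fractions import Fraction


def cantor_dimension(N, r):
    """Fractal dimension D = -log N / log r of the asymptotic set."""
    return -math.log(N) / math.log(float(r))


def mean_gap_length(N, r, m):
    """Mean length c of the m generator gaps, in units of r.

    Uses c*m*r = 1 - r*N: the gaps occupy whatever the N kept intervals leave.
    """
    return (1 - r * N) / (m * r)


r = Fraction(1, 9)
for m in (4, 2, 1):  # generators (a), (b), (c) of Fig. 1
    print("m =", m, " c =", mean_gap_length(5, r, m),
          " D =", round(cantor_dimension(5, r), 4))
```

Exact fractions are used for r so that c comes out as an exact rational; with fewer gaps (smaller m) the same removed length is shared among fewer, longer voids.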

After the independent quantities N, r, and m have been
introduced, we can turn to the explicit calculation of Zipf's
law and related quantities. In the first iteration of the
deterministic process, we have m voids with ranks from $R_1 = 1$
to m and length $A_1 = c\,r$. In the second iteration, there will
be mN voids occupying ranks from $R_2 = 1 + m$ to m + mN,
and their typical length is $A_2 = c\,r^2$. In general, in the i-th
iteration there are $m N^{i-1}$ voids of average size

$$A_i = c\,r^i \qquad (3)$$

and ranking from

$$R_i = 1 + m\,\frac{N^{i-1} - 1}{N - 1} \qquad (4)$$

to $R_i + m N^{i-1} - 1 = R_{i+1} - 1$. One can verify that the rank
of the first and the last void in each size class scales in the
same way with the parameters of the system (the function
A(R) is step-shaped with steps of equal length in logarithmic
scale; see the appendix). For the sake of clarity, we will use
only the value $R_i$ to calculate the explicit form of Zipf's law,
defined in parametric form by Eqs. (3) and (4). Eliminating
the parameter i and arranging terms we get (for large R),



$$A(R) \simeq f(r,N,m)\,R^{-1/D}, \qquad (5)$$

as shown in the appendix, where D is the fractal dimension
of the set and

$$f(r,N,m) = \frac{1 - rN}{m}\left(\frac{N-1}{m}\right)^{-1/D}. \qquad (6)$$

Mandelbrot (1977) introduced the gaps' length distribution
Nr, defined as the cumulative number of gaps with
length larger than a certain given scale A, and proposed that
it is a power law with exponent -D ($Nr \propto A^{-D}$). Now notice
that the rank $R_i$ is defined by adding the total number
of gaps larger than or equal to the i-th gap. Hence, the gaps'
length distribution corresponds to R(A), which we can get
through inversion of (5):

$$R(A) = F(r,N,m)\,A^{-D}, \qquad (7)$$

where the prefactor

$$F(r,N,m) = f(r,N,m)^D = \frac{m}{N-1}\left(\frac{1 - rN}{m}\right)^{D} \qquad (8)$$

is (a measure of) the lacunarity of the fractal set. Indeed,
$F \propto m^{1-D}$ grows with m, so a fractal is the more "lacunar"
the smaller is F. F ranges from $(1 - rN)^D/(N - 1) < 1$ for
m = 1 to one for m = N - 1. We conclude that $F^{-1}$ is a
measure of lacunarity, in accord with Mandelbrot (1977).

However, there are other measures of lacunarity and,
actually, Mandelbrot concluded that it might be best to consider
the fluctuations of the mass function $\mathcal{M}(\mathcal{R})$
(Mandelbrot, 1977) (defined as the mean mass inside a ball of radius
$\mathcal{R}$ centered on a point, $\mathcal{M}(\mathcal{R}) \propto \mathcal{R}^D$).
This measure of lacunarity is related to the three-point correlation
function and has been the one most employed (see Blumenfeld & Ball
(1993) and references therein).

A particular case of the relations above is provided by
fractals with maximal m = N - 1 (implying 2N - 1 = 1/r),
that is, with gaps of minimal length. For those fractals,
relations (5) and (7) hold exactly for all R and the lacunarity
$F^{-1} = 1$ (minimal). The triadic Cantor set and the fractal
of Fig. 1(a) are examples of this particular type. However,
the triadic Cantor set is somewhat special: since N = 2, the
only possible value of m is one and $F^{-1} = 1$ is its largest
possible value. This explains why the gap is relatively large,
despite the lacunarity being minimal. Nevertheless, we can
construct fractals with its same dimension and larger lacunarity,
for example, by taking r = 1/9 and N = 4; namely,
the cases m = 1, 2. The case m = 3 gives rise to fractals
with $F^{-1} = 1$, one of which actually is the triadic Cantor
set.



3 SCALING OF VOIDS IN DIMENSIONS 2
AND 3

The definition of void is simple and clear-cut in dimension
d = 1, because a point divides a segment into two disconnected
parts. When dealing with point-sets in higher dimensions,
voids are usually ill defined, since empty areas or volumes
are (usually) connected. Indeed, the factor c(r, N, m)
in (3) was taking care of the connection between adjacent
voids, and for d > 1 more than one definition is possible.
A possible generalization of the definition in the previous
section for d > 1 would be that only voids of equivalent size
(that is to say, in a given iteration) are allowed to coalesce
into a larger void. In this case, our results can be
straightforwardly generalized and Eq. (5) reads

$$A(R) \simeq \frac{1 - r^d N}{m}\left(\frac{N-1}{m}\right)^{-d/D} R^{-d/D}, \qquad (9)$$



where now A stands for the area or the volume (in units of
$1/r^d$) in dimension d = 2, 3, respectively. A particular generator
is the one that starts with a square and removes a number
of the $1/r^2$ square parts in a manner symmetrical with
respect to a diagonal. The result is just a cartesian product
of one-dimensional Cantor sets. The simplest example is obtained
by taking r = 1/3 and removing five patches forming
a cross out of the nine initial ones. The fractal so generated
is the cartesian product of triadic Cantor sets. Clearly, the
complementary set of these fractals is connected, but one
can assume that the voids produced at every iteration are
independent; then, one gets Eq. (9).
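The scaling exponent -d/D can be verified by listing the size classes of voids directly. The sketch below is an illustration under explicit assumptions: voids only coalesce within one iteration (so iteration i contributes $m N^{i-1}$ voids of equal size), sizes are measured in units of the whole sample (length, area or volume equal to one), and the names `void_classes` and `asymptotic_slope` are ours.

```python
import math


def void_classes(N, r, m, d, iterations):
    """First rank R_i and size A_i of each size class of voids.

    Assumption: iteration i adds m * N**(i-1) voids, each of size
    c * r**(d*i), where c is fixed by c * m * r**d = 1 - N * r**d.
    """
    c = (1.0 - N * r ** d) / (m * r ** d)
    classes = []
    first_rank = 1
    for i in range(1, iterations + 1):
        classes.append((first_rank, c * r ** (d * i)))
        first_rank += m * N ** (i - 1)  # ranks taken by this size class
    return classes


def asymptotic_slope(classes):
    """Log-log slope of A(R) between the two smallest size classes."""
    (R1, A1), (R2, A2) = classes[-2], classes[-1]
    return math.log(A2 / A1) / math.log(R2 / R1)


examples = {
    "triadic Cantor set (d = 1)": (2, 1.0 / 3.0, 1, 1),
    "product of triadic Cantor sets (d = 2)": (4, 1.0 / 3.0, 1, 2),
}
for name, (N, r, m, d) in examples.items():
    D = -math.log(N) / math.log(r)
    slope = asymptotic_slope(void_classes(N, r, m, d, 12))
    print(name, " slope =", round(slope, 3), " -d/D =", round(-d / D, 3))
```

In the two-dimensional example the cross of five removed patches is treated as a single void (m = 1); choosing a different bookkeeping for m changes the prefactor but not the slope.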

Other definitions of what constitutes a void are also 
possible. If we change our definition, the prefactor of A(R) 
will change its precise form, and the estimated lacunarity 
of the fractal will also change accordingly. Nonetheless, the 
scaling form of the Zipf law for void sizes remains unchanged, 
since it is independent of m and c. In the following, we show 
through numerical examples that any reasonable definition 
of void which is coherently applied at all scales returns the 
correct scaling for the Zipf law, and thus allows one to obtain 
quantitative information on the geometry of the distribution 
of points in the fractal set. 

3.1 Voids in random fractals 

In order to move towards the description of fractals aris- 
ing in natural processes, we should first relax the determin- 
istic character of their construction. Consider now a two- 
dimensional set for which r = 1/3 and N = 4, but where 
the five areas to be removed at each step in the construc- 
tion are chosen at random. However, to mimic the observed 
morphology of the galaxy distribution, we must impose some 
constraints: the galaxy distribution has been characterized 
as a sponge-like network, with filaments and walls where 
galaxies accumulate (Gott, Melott & Dickinson, 1986). In 
two dimensions, we should generate a fractal with some trace 
of filamentary structure. A rough way to achieve this is by 
constraining the five patches removed at every step to form 
a particular "convex-like" shape, namely, a four-piece square 
with an extra piece adjacent to one side. 

Randomization worsens the scaling range, so one should 
iterate the random generator many more times than the de- 
terministic one to keep the scaling range similar. Alterna- 
tively, we will choose to average over independent realiza- 
tions of the fractal constructed with the same number of 
iterations to improve the measures. 

Since the problem of defining voids in two-dimensional
sets is similar to the one arising in three dimensions, we
will work with d = 2 and extrapolate our results to higher
dimensionality. Fig. 2 represents a fractal constructed by
removing five randomly chosen (connected) patches of area $r^{2i}$
at iteration i. Now, the possibility arises that voids produced
in subsequent iterations are adjacent to previously existing
ones and result in a more or less apparent increase in the
size of a large void. As we pointed out, we wish to design
an algorithm to find the voids in our set and apply it at
all scales. Our working hypothesis is that the precise shape
of the void is not relevant to recover the scaling behaviour
of Zipf's law, as long as it is kept constant through iterations,
and thus it is not relevant either to recover the fractal
dimension of the associated distribution of matter.

Figure 2. (a) Fractal point set generated by randomly removing
5 connected patches of linear size $r^i = (1/3)^i$ at each iteration i.
The squares correspond to the largest 25 voids found in the system
through the algorithm described in the text. (b) Zipf law for void
sizes calculated according to the same algorithm. The solid line
has slope -2/D, where D = log 4 / log 3. The normalization of this
line (its height at the origin) is arbitrary. The numerical data have
been averaged over 500 independently generated fractals.

Simple recursion relations of the previous type (Section 
2) cannot be inferred when studying fractal sets arising from 
physical processes. Instead, one faces a set of points irregu- 
larly distributed in space and has to resort to other methods 
to estimate the distribution of voids. One of the simplest 
ways of defining an area devoid of points in the structure is 
the following: 

(i) Coarse-grain your system by defining elementary cells
such that there is at most one point per cell (thereby defining
a lower cutoff to scaling, see section 4).

(ii) Decide on regular elements to cover empty areas (say
a square or a circle in d = 2).

(iii) Locate the largest square/circle centered at each
empty cell (its boundary is limited by filled cells or sample
boundaries).

(iv) Select the largest one, which is by definition a void 
of size Ai, and assign it rank R = 1. 

(v) Fill the cells in the selected void (equivalent to those 
in the fractal set). 

(vi) Repeat the procedure with the remaining empty cells 
until all of them are covered (this is somehow reminiscent of 
the box-counting method to estimate the fractal dimension 
(Falconer, 1990)). 
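Steps (i) to (vi) can be sketched in a few lines of Python for square voids. This is a minimal, unoptimised illustration and not the authors' implementation: it assumes the coarse-graining of step (i) has already produced a boolean grid, it restricts squares to odd side lengths so that they can be centred on a cell, and it breaks ties between equal voids by scan order. The function names are ours.

```python
def largest_centered_square(filled, row, col):
    """Half-width k of the largest (2k+1) x (2k+1) square centred on an empty
    cell that stays inside the grid and contains no filled cell."""
    n_rows, n_cols = len(filled), len(filled[0])
    k = 0
    while True:
        k_next = k + 1
        if (row - k_next < 0 or col - k_next < 0
                or row + k_next >= n_rows or col + k_next >= n_cols):
            return k  # limited by the sample boundary
        # Only the new outer ring needs checking; the interior is already empty.
        for j in range(col - k_next, col + k_next + 1):
            if filled[row - k_next][j] or filled[row + k_next][j]:
                return k
        for i in range(row - k_next, row + k_next + 1):
            if filled[i][col - k_next] or filled[i][col + k_next]:
                return k
        k = k_next


def find_square_voids(occupied):
    """Return [(area, row, col), ...] in rank order (largest void first)."""
    filled = [list(row) for row in occupied]  # working copy, step (i) done
    voids = []
    while True:
        best = None
        for i, row in enumerate(filled):            # step (iii)
            for j, cell in enumerate(row):
                if not cell:
                    k = largest_centered_square(filled, i, j)
                    if best is None or k > best[0]:
                        best = (k, i, j)
        if best is None:                            # no empty cell left
            return voids
        k, i, j = best                              # step (iv)
        for a in range(i - k, i + k + 1):           # step (v)
            for b in range(j - k, j + k + 1):
                filled[a][b] = True
        voids.append(((2 * k + 1) ** 2, i, j))      # step (vi): repeat


# Toy sample: a 9 x 9 grid with points only at the four corners.
occupied = [[False] * 9 for _ in range(9)]
for i, j in [(0, 0), (0, 8), (8, 0), (8, 8)]:
    occupied[i][j] = True
voids = find_square_voids(occupied)
sizes = [area for area, _, _ in voids]
print(sizes[:5], "total area covered:", sum(sizes))
```

Because every selected void is filled before the next search, the voids never overlap and the list comes out already ordered by decreasing size; the rank of a void is simply its position in the list plus one.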

When the algorithm finishes, an ordered list of voids of de- 
creasing size is produced. In Fig. 2a we represent an exam- 
ple of the first stages of the void-finding algorithm applied 
to a random fractal generated with the "filamentary" gen- 
erator, while Fig. 2b represents the obtained function A(R) 
for square voids. As can be seen, the voids measured with 
the previous algorithm follow indeed the scaling expected 
according to the analytic prediction for deterministic frac- 
tals. We have applied our method to fractals constructed 
with several different generators in d = 2 and, in all cases, 
have obtained a good quantitative agreement between the 
predicted slope —d/D and the numerically obtained one. 

Another regular shape for voids that we have investigated
is the circular one. Although the largest voids clearly become
even larger when a circular coverage is used, boundaries
among voids define smaller voids which, however, are
not limited by points in the fractal set (see Fig. 3). A first
attempt to correct for these boundary effects would be to
reject, in the final count, the voids that do not touch at least
one point of the fractal set (that is, that only touch the ex- 
ternal boundaries and/or other voids). This modification of 
our previously defined algorithm returns a better scaling for 
average sizes in the case of circular voids (both far from the 
boundary and from the elementary cell, see Fig. 3). These 
size classes were overloaded with "artificial" voids placed 
among previously defined voids and/or external boundaries. 



4 MEAN SIZE OF VOIDS IN A FRACTAL 

As shown before, the distribution of voids in a fractal is 
a power law and, therefore, the mean size of voids is not 
a characteristic value, being dependent on the upper and 
lower cutoffs to the fractal scaling. Actually, the mean size 
of voids has been employed as a test for fractality, under 
the assumption that in a fractal the mean size of voids is 
proportional to the size of the sample, that is, the upper 
cutoff (Einasto et al., 1989). We shall show that this is not 
always the case: in fact, the mean size of voids is usually 
dependent on both cutoffs. 

Figure 3. (a) The 25 largest circular voids found in the same
set of Fig. 2 taking into account finite-size corrections. (b) Zipf's
law. The dashed line shows A(R) for circular voids calculated
according to the algorithm described in section 3. The circles show
A(R) for circular voids when only those with boundaries limited
by the fractal set are counted. The scaling improves in the second
case. Again, the solid line only indicates the expected slope
-2/D. Numerical results have been averaged over 100 independently
generated fractals.

To calculate the mean size of voids we must use the
probability distribution of sizes $p(A) \propto A^{-\alpha}$, with
$\alpha = 1 + D/d$. Then,

$$\bar{A} = \frac{\int_{A_l}^{A_u} A\,p(A)\,dA}{\int_{A_l}^{A_u} p(A)\,dA}, \qquad (10)$$

where $A_l$ and $A_u$ are the lower and upper cutoffs, respectively.
The computation of the integrals is straightforward and

$$\bar{A} = \frac{-\alpha + 1}{-\alpha + 2}\;\frac{A_u^{-\alpha+2} - A_l^{-\alpha+2}}{A_u^{-\alpha+1} - A_l^{-\alpha+1}}. \qquad (11)$$

For large ratio $A_u/A_l$ (as required to have a reasonable scaling
range) and taking into account that $1 < \alpha < 2$,

$$\bar{A} \simeq \frac{\alpha - 1}{2 - \alpha}\,A_u^{2-\alpha}\,A_l^{\alpha-1}. \qquad (12)$$



This expression cannot be reduced any further and depends on
both cutoffs. Note that if D = d/2 then $\alpha - 1 = 2 - \alpha = 1/2$
so that the mean size is just the geometric mean of both
cutoffs (so to speak, both contribute equally); if D > d/2
then $\alpha - 1 > 2 - \alpha$ and the lower cutoff contributes more to
$\bar{A}$, and vice versa if D < d/2.
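The dependence of the mean size on both cutoffs can be explored with a short numerical sketch. It assumes a pure power law $p(A) \propto A^{-\alpha}$ between the cutoffs with $\alpha = 1 + D/d$ and 0 < D < d; the two function names are ours. The first function evaluates the ratio of the two integrals in closed form, the second keeps only the leading term for a large ratio of cutoffs.

```python
def mean_void_size_exact(A_l, A_u, D, d):
    """Mean size as the ratio of the integrals of A*p(A) and p(A),
    with p(A) ~ A**(-alpha) between the cutoffs A_l and A_u."""
    alpha = 1.0 + D / d
    num = (A_u ** (2 - alpha) - A_l ** (2 - alpha)) / (2 - alpha)
    den = (A_u ** (1 - alpha) - A_l ** (1 - alpha)) / (1 - alpha)
    return num / den


def mean_void_size_asymptotic(A_l, A_u, D, d):
    """Leading behaviour for A_u >> A_l and 1 < alpha < 2."""
    alpha = 1.0 + D / d
    return (alpha - 1) / (2 - alpha) * A_u ** (2 - alpha) * A_l ** (alpha - 1)


# D = d/2: the mean is the geometric mean of the two cutoffs.
print(mean_void_size_exact(1.0, 1e8, 1.0, 2), (1.0 * 1e8) ** 0.5)

# D = 1.2 in d = 3: compare the exact and the asymptotic expressions.
for ratio in (1e2, 1e6, 1e12):
    exact = mean_void_size_exact(1.0, ratio, 1.2, 3)
    approx = mean_void_size_asymptotic(1.0, ratio, 1.2, 3)
    print(ratio, exact, approx)
```

The asymptotic form approaches the exact one only slowly when $2 - \alpha$ or $\alpha - 1$ is small, so for a modest scaling range the full expression is the safer one to compare with data.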

We now discuss what values we should take for $A_u$ and
$A_l$ in point distributions: whereas $A_u$ is the well determined
size of the sample, the lower cutoff is trickier. Since a random
fractal has a random component superposed on the deterministic
algorithm that generates it, on the lowest scales
the random component dominates and the distribution is
approximately of Poisson type but with a very low mean
number of points in the associated volume (shot noise). If
the number function* in the correlated (fractal) regime is
$\mathcal{N}(\mathcal{R}) = B\,\mathcal{R}^D$, the crossover scale
from the Poisson to the correlated regime is $B^{-1/D}$ (this scale
has been defined by Balian & Schaeffer (1989) in a more general
context). It is not totally clear
what value is adequate for $A_l$ in a galaxy catalogue. It must
be larger than $(0.1\ \mathrm{Mpc})^3$ but it can be considerably larger,
according to the way the catalogue has been compiled and
the algorithm selected to find voids. At any rate, keeping $A_l$
fixed one obtains that $\bar{A}$ is not proportional to $A_u$ but rather
to a power of it with exponent $2 - \alpha = 1 - D/d$ such that
$0 < 1 - D/d < 1$ (d = 3). A value of this exponent close to
1 (as reported by Einasto et al. (1989), see also Section 5)
would imply a fractal dimension $D \ll 1$. A set with D = 0
is not really a fractal, but a collection of isolated points.

We have carried out numerical measurements on 2-dimensional
fractals of known dimension D to test the accuracy
to which a real fractal sample follows the scaling (12).
Figure 4 depicts some of our results. There, we have generated
a fractal with N = 3 and similarity ratio r = 1/2,
hence D = 1.58. The single patch to be removed at each
iteration was chosen at random (see insert in Fig. 4). In order
to see how $\bar{A}$ depends on $A_u$ we have kept the lower cut-off
fixed and equal to the size of the individual cell, $A_l = 1$, and
varied $A_u$. As long as $A_l \ll A_u$ we observe a neat scaling
with the predicted exponent $2 - \alpha$. Next, the upper cut-off
was kept fixed to its maximum value for this fractal set,
$A_u = 2^{14}$. The variation of $A_l$ was carried out in practice
by averaging only over voids of area equal to or larger than
$A_l$. In this second case, the asymptotic scaling exponent is
$\alpha - 1$. Numerical results are compared with the two curves
obtained from Eq. (11) with $A_l = 1$ and $A_u = 2^{14}$, respectively.
Only when both cut-offs become of comparable order
are deviations from scaling observed.
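A random fractal of this kind (N = 3, r = 1/2, one randomly chosen quadrant removed from every occupied square at each iteration) is easy to generate. The sketch below is an illustration, not the code used for Fig. 4: the choice of seven iterations is our assumption, made because a $2^7 \times 2^7$ grid contains $2^{14}$ elementary cells, and the names `random_fractal` and `box_count` are ours.

```python
import math
import random


def random_fractal(levels, rng):
    """Occupied cells after `levels` iterations on a 2**levels square grid.

    At each iteration every occupied square is split into 2 x 2 parts and
    one of them, picked at random, is removed (N = 3, r = 1/2).
    """
    cells = [(0, 0)]
    for _ in range(levels):
        next_cells = []
        for x, y in cells:
            quadrants = [(2 * x + dx, 2 * y + dy) for dx in (0, 1) for dy in (0, 1)]
            removed = rng.randrange(4)  # index of the single patch to delete
            next_cells.extend(q for k, q in enumerate(quadrants) if k != removed)
        cells = next_cells
    return cells


def box_count(cells, shift):
    """Number of boxes of side 2**shift cells containing at least one point."""
    return len({(x >> shift, y >> shift) for x, y in cells})


levels = 7  # assumption: 2**7 x 2**7 grid, i.e. 2**14 elementary cells
cells = random_fractal(levels, random.Random(0))
counts = [box_count(cells, s) for s in range(levels + 1)]
D_est = math.log(counts[0] / counts[-1]) / math.log(2 ** levels)
print(counts, round(D_est, 3))
```

Whatever the random choices, the box counts follow powers of three, so the box-counting estimate reproduces D = log 3 / log 2; the randomness only affects where the voids lie, which is what the void statistics probe.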



5 GALACTIC VOIDS 

Galactic voids are vast regions of space apparently devoid
of luminous matter (galaxies). Several authors have developed
algorithms in order to detect the extent and frequency
of such regions in current galaxy catalogues (Einasto et al.,
1989; Kauffmann & Fairall, 1991; Hoyle & Vogeley, 2002).
The aim of these studies is to gain a better understanding of
the morphology of the distribution of galaxies, in order to,
eventually, correlate it with the physical mechanisms responsible
for the observed structure. A first step in this direction
has been the comparison of measures of voids made on the
current galaxy catalogues with measures on N-body simulations
of cold dark matter (Müller et al., 2000; Arbabi-Bidgoli
& Müller, 2002). The problem of defining what constitutes a
void in d = 3 has been ubiquitous, and all these studies have
solved this indeterminacy in different ways. Interestingly,
all of them have looked for maximal volumes of approximately
convex shape (but differently shaped depending on
the area studied) inscribed among matter points, while none
has considered regular volumes. Our main point here is that,
when trying to recover quantitative information, shape matters,
implying that voids have to maintain their shape at all
scales. On the one hand, we have shown that this criterion
permits us to obtain information on a fractal distribution
of matter. On the other hand, our definition eliminates certain
arbitrariness in void-finding algorithms, for instance the
amount of overlap between voids to be merged into a single
larger void.

* The number function is the mean number of particles inside
a ball of radius $\mathcal{R}$ centered on a particle and equals the mass
function $\mathcal{M}(\mathcal{R})$ divided by the mass of a particle.

Figure 4. Scaling of the mean size of voids when one cut-off is
varied and the other kept fixed. The insert depicts a random fractal
with N = 3 and r = 1/2 embedded in a space of dimension
d = 2. Solid lines correspond to a numerical solution of the
expression for $\bar{A}$ (11) corresponding to this fractal (with D = 1.58).
The slight difference between numerical and analytical results is
due to the substitution of a sum (over discrete void areas) by an
integral in the expression for $\bar{A}$ (10).

With these caveats in mind, we have examined two studies
reporting large voids and analysed the function A(R)
that they produce. Recently, Hoyle & Vogeley (2002) have
examined the Point Source Catalogue Survey (PSCz) and
the Updated Zwicky Catalog (UZC) for the presence of
voids. The volume of the 35 and 19 largest voids (respectively)
is plotted in Fig. 5 against their rank. Although,
in these cases, A(R) is relatively well fitted by a straight line
for the largest voids, the slope is too low to represent the
complementary set (that is, the set of voids) of a fractal
distribution of matter. There is a physical restriction to the
exponent of the scaling law, since the dimension of the fractal
cannot be larger than that of the embedding space, that
is to say, d > D. This implies that the exponent -d/D has
to be larger than unity in absolute value. We represent in
Fig. 5 this restriction and observe that the putative slopes
are much lower than this value. The classical value of the
fractal dimension, deduced from the two-point correlation
function $\xi(r) \propto r^{-\gamma}$, is $D = 3 - \gamma \simeq 3 - 1.8 = 1.2$.
However, recent reanalyses of the galaxy catalogues (Sylos Labini et
al., 1998) yield a larger value, namely, $D \simeq 3 - 1.1 = 1.9$.
In this case, the expected exponent for A(R) vs R would be
about -1.5, which we represent as a dotted line in Fig. 5.
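The slopes quoted above follow from two one-line relations, which the sketch below evaluates. The function names are ours; the only inputs are the correlation exponents 1.8 and 1.1 mentioned in the text and the embedding dimension d = 3.

```python
def dimension_from_gamma(gamma, d=3):
    """Fractal dimension D = d - gamma from xi(r) ~ r**(-gamma)."""
    return d - gamma


def zipf_slope(D, d=3):
    """Expected log-log slope of A(R) for a fractal of dimension D."""
    return -d / D


def is_admissible_slope(slope):
    """A fractal with D < d requires |slope| = d/D > 1."""
    return abs(slope) > 1.0


for gamma in (1.8, 1.1):
    D = dimension_from_gamma(gamma)
    print("gamma =", gamma, " D =", round(D, 2), " slope =", round(zipf_slope(D), 2))
```

For D = 1.9 the relation gives a slope close to -1.58, while the rounder value -1.5 corresponds exactly to D = 2; either way, a measured slope shallower than -1 cannot come from a fractal embedded in three dimensions.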

Ten years ago, Kauffmann & Fairall (1991) developed 
an algorithm to search for voids. They reported a list of 129 
'significant' voids obtained from the merged Southern Red- 
shifts Catalogue and the Catalogue of Radial Velocities of 
Galaxies. Their list is represented in Fig. 5 together with 
the previous data. Apart from an initial almost flat stage, 
we have found that this function is reasonably well fitted by 
an exponential law (not shown in the figure). This does not 
correspond to a fractal geometry and would rather corre- 
spond to a Poisson distribution of points, where voids much 
larger than the volume per point should be exponentially 
suppressed. However, they define a significant void as one 
that ". . . occurs in the random catalogue simulations with 
probability less than 1 per cent" (see Kauffmann & Fairall 
(1991) for more details). Moreover, they take as the reference 
Poisson distribution for a given catalogue the one with the 
same number of points but randomly distributed. While one 
certainly cannot consider voids as significant below the scale 
of the lower cutoff $A_l$, where the distribution can be effectively
considered Poissonian (as remarked in section 4), their
procedure produces a Poissonian distribution with a much 
larger mean interparticle distance. Hence, even though very 
large voids are almost always significant in their sense, this 
is not the case for average- and small-size voids, which occur 
frequently in a random catalogue with the same number of 
points. Those voids are not included in the list they provide, 
and hence the distribution is strongly depleted in the mean 
and small-void domains. 

Einasto and co-workers (1989) have shown that mean 
void diameters increase with the sample size in a power- 
law manner. They have identified this fact as indicative of 
self-similarity in the matter distribution. Their qualitative 
results agree with the calculations here reported (section 4). 
However, the quantitative result does not agree with our 
prediction. In the work by Einasto et al. (1989), we understand
that measurements were carried out in such a way
that the lower cut-off was kept fixed, while the upper cut-off
varied. One expects then a scaling of the form
$\bar{A} \propto A_u^{\beta_u}$, with $\beta_u = 1 - D/d$. Since 0 < D < d,
the exponent is bounded, $0 < \beta_u < 1$, with $\beta_u = 0$
corresponding to a homogeneous distribution of points and
$\beta_u = 1$ to an (almost) empty set of points. This second value
is the one reported by Einasto et al. (1989).

All our considerations apply to galaxies as particles with
no features, in other words, ignoring type and luminosity.
It has been argued that voids could be populated by faint
galaxies, called field galaxies to distinguish them from "normal"
cluster galaxies (El-Ad et al., 1997; Hoyle & Vogeley,
2002). Luminosity can be taken into account by generalizing
the concept of a fractal distribution to a multifractal
distribution (Sylos Labini et al., 1998), which we do not consider
here. If the neglected field galaxies have a distribution
with geometrical properties different from the more luminous
ones, the multifractal model would not apply. However,
the standard biased galaxy formation picture attributes similar
scaling properties to the distributions at various biasings
(Gabrielli, Sylos Labini & Durrer, 2000), in accord with a
multifractal distribution. A sort of biasing could be mimicked
for a pure fractal by randomly removing a fraction of
points, which would not alter its scaling properties. At any
rate, a thorough analysis of this question falls beyond the
scope of this work.

Figure 5. Zipf's plot for three available void catalogues. Data
represented as circles (PSCz) and squares (UZC) are from Hoyle
& Vogeley (2002), units are $h^{-3}\,\mathrm{Mpc}^3$; data in crosses from
Kauffmann & Fairall (1991), units are $\mathrm{km}^3\,\mathrm{s}^{-3}$. Under the
hypothesis that matter is self-similarly distributed in the universe,
the scaling expected for void volumes is $A(R) \sim R^{-3/D}$, where D
stands for the fractal dimension of the galaxy distribution. The solid
line has slope -1 (indeed, any function A(R) has to have slope
larger than unity in absolute value). The dotted line signals the
expected scaling if, according to recent measures, $D \simeq 2$, leading
to the -1.5 scaling relation shown in the figure. The dotted and
the solid lines just indicate the (expected) slope of A(R) in each
case.

sition to homogeneity takes place. Since voids involved in
Zipf's plot would cover the whole range of sizes, there might
be some systematic deviations in case the fractal dimension
is scale dependent: D decreases with decreasing scale, hence
the exponent of the rank-ordering plot increases in absolute
value, and the function becomes concave from below.

• Transition to homogeneity - There cannot be voids
larger than the characteristic length at which the universe
becomes homogeneous; but the characteristic size of the
largest voids is an independent scale (Balian & Schaeffer,
1989) that could be significantly smaller than the homogeneity
scale. Furthermore, the breakdown of scaling in the
void distribution at the characteristic size of the largest
voids might suggest that the recent observations reported
by Hoyle & Vogeley (2002) and returning a flat A(R) are
related to it.



5.1 Sources of deviation from scaling 

Apart from our previous discussion on the way in which 
voids are defined and counted, there are a number of mech- 
anisms which, in our understanding, could produce a signif- 
icant deviation from the scaling regime. We list them and 
briefly discuss their effects. Sometimes the source of devia- 
tions can be identified and even eliminated. But often, even 
in what sense they would affect Zipf's plot is unclear. The 
following list might not be exhaustive, but some or even all 
of the listed problems can affect the observations to date. 
However, note that scaling corresponding to D > d should 
not occur, since it is physically forbidden. 

• Finite size effects - Usually, it is unavoidable to use a 
"boundary" to limit the fractal set that we are measuring. 
We have already seen that, in particular for certain shapes 
of voids, systematic deviations from scaling can be obtained. 

• Scale-dependent dimension of galaxy distribution - It 
has been recently reported (Bak & Chen, 2001) that the 
dimension of the galaxy distribution varies with the obser- 
vation scale. It grows smoothly from zero when approaching 
the size of single galaxies to three at the scale where the tran- 



6 CONCLUSIONS 

There is a quantitative and well-defined relationship between
the dimension $D$ of a fractal set and the exponent of
Zipf's plot $\Lambda(R)$ for the corresponding void sizes $\Lambda$. We have
illustrated this dependency with regular fractals in $d = 1$, for
which exact relations have been derived. Next, the introduction
of a simple algorithm to identify voids in any dimension
has allowed us to show that the relation $\Lambda(R) \sim R^{-d/D}$ also
holds in stochastic fractals defined in dimension $d = 2$. We
have shown that the mean size of voids in a sample defined
between a lower and a higher cut-off scales with these quantities,
$\bar{\Lambda} \propto \Lambda_u^{\beta_u} \Lambda_l^{\beta_l}$. The exponents $\beta_u$ and $\beta_l$ depend on
the fractal dimension $D$; hence, the relation between $\bar{\Lambda}$ and
the two cut-offs depends on the fractal. Our results can be
straightforwardly extrapolated to $d = 3$.

This study has been performed with the aim of apply- 
ing it to current measures of the distribution of galaxies. On 
the one hand, current measures of the two-point correlation 
function seem to be consistent with a fractal distribution, 
with a yet uncertain dimension $1 < D < 2$, and in a still
controversial range from 1 to, perhaps, $\sim 100\,h^{-1}$ Mpc (or
even more) (Guzzo, 1997; Sylos Labini et al., 1998; Martinez,
1999). On the other hand, current void catalogues (Kauffmann
& Fairall, 1991; Hoyle & Vogeley, 2002) do not seem
to support this result. Nonetheless, in view of the discussion
that forms the body of our work, they are insufficient
to discard the hypothesis of a fractal distribution of galaxies.
To assess the capability of void finding algorithms to
detect fractal structure and then the fractal dimension D, 
we would suggest that they be tested with simple examples, 
for which exact results can be easily obtained, as we have 
done here. 
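
A test of this kind can be sketched in a few lines. The code below is only an illustration under stated assumptions, not the algorithm used in this work: it builds the list of gaps of the deterministic one-dimensional fractal of Appendix A directly (so the "void finder" is trivial), rank-orders them, keeps the first rank of each distinct size (the corners of the staircase in Zipf's plot), and fits the log-log slope by least squares, discarding the first two generations where the asymptotic regime has not yet been reached. The function names and the parameter values $N = 5$, $m = 2$, $r = 0.15$ are hypothetical choices.

```python
import math


def void_sizes(N, m, r, depth):
    """Gap lengths of a 1-D Cantor-like set: m * N**(i - 1) gaps of size c * r**i."""
    c = (1.0 - r * N) / (m * r)  # from c*m + N = 1/r
    sizes = []
    for i in range(1, depth + 1):
        sizes.extend([c * r**i] * (m * N**(i - 1)))
    return sizes


def zipf_corners(sizes):
    """Rank-order sizes (largest first); keep the first rank of each distinct size."""
    ordered = sorted(sizes, reverse=True)
    corners = []
    for rank, size in enumerate(ordered, start=1):
        if not corners or not math.isclose(size, corners[-1][1]):
            corners.append((rank, size))
    return corners


def loglog_slope(points):
    """Least-squares slope of ln(size) against ln(rank)."""
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(size) for _, size in points]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


N, m, r, depth = 5, 2, 0.15, 7
D_exact = math.log(N) / math.log(1.0 / r)
corners = zipf_corners(void_sizes(N, m, r, depth))
slope = loglog_slope(corners[2:])  # skip the first two generations
print(f"exact D = {D_exact:.4f}, fitted slope = {slope:.4f}, "
      f"expected -1/D = {-1.0 / D_exact:.4f}, estimated D = {-1.0 / slope:.4f}")
```

Fitting only the corners avoids the bias of the flat steps, which would otherwise dominate the fit because most voids belong to the last generation. A void finder applied to the point set itself, rather than to the known list of gaps, could be validated against the same expected slope.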

It would be interesting to extend the measures of void
sizes to scales smaller than the ones usually probed in a systematic
way and, specifically, compare them with a scaling
distribution. The careful construction of the Zipf plot of void
sizes and the analysis of its scaling (or not), as well as the
scaling of the average size of voids with the sample size, are
complementary measures to $n$-point correlation functions
and could lend additional support to the fractality of the distribution
of galaxies. In any case, it is clear that the two methods must
provide equivalent information, meaning that the study of
the convergence of both approaches could help distinguish
different sources of deviation from scaling and, moreover, better
characterize the morphology of the galaxy distribution.

† The concept of homogeneity involves some subtleties (Gaite,
Domínguez & Pérez-Mercader, 1999). Here and henceforth, we
mean by homogeneity that the relative density fluctuations are
small.

APPENDIX A: ZIPF'S LAW FOR A DETERMINISTIC FRACTAL

We show here that the function $\Lambda(R)$ has (approximately)
equal steps in logarithmic scale and how to eliminate the
discrete variable $i$ from the equations for $\Lambda_i$ and $R_i$.

We can asymptotically approximate the equation for $R_i$ by

$$R_i \simeq \frac{m}{N-1}\,N^{i-1} \qquad \mathrm{(A1)}$$

for large $R_i$ (which implies that $i$ is large, assuming that $N$
and $m$ are not). The step length in logarithmic scale is

$$\ln\left(R_i + m N^{i-1}\right) - \ln R_i = \ln\left(1 + \frac{m N^{i-1}}{R_i}\right) \simeq \ln N. \qquad \mathrm{(A2)}$$

On the other hand, within the same approximation, after
taking logarithms of the equations for $\Lambda_i$ and $R_i$,

$$\ln \Lambda_i = \ln c + i \ln r, \qquad \mathrm{(A3)}$$

$$\ln R_i = \ln\frac{m}{N-1} + (i-1)\ln N. \qquad \mathrm{(A4)}$$

Now, it is easy to solve for $i$ in the second equation and
substitute it in the first one, obtaining (with $D = -\ln N/\ln r$)

$$\ln \Lambda_i = \ln r \left(\frac{\ln R_i - \ln[m/(N-1)]}{\ln N} + 1\right) + \ln c
= -\frac{1}{D}\left(\ln R_i + \ln\frac{N-1}{m}\right) + \ln r + \ln c. \qquad \mathrm{(A5)}$$

From $cm + N = 1/r$, $rc = (1 - rN)/m$. Then, after removing
the logarithms,

$$\Lambda_i = \frac{1 - rN}{m}\left(\frac{m}{N-1}\right)^{1/D} R_i^{-1/D}. \qquad \mathrm{(A6)}$$

Let us briefly analyze the accuracy of the approximation
(A1). For the example in Fig. 1, with $N = 5$ and $m = 1, 2, 4$,
$R_3 = 7, 13, 25$, whereas the approximation yields 6.25, 12.5,
25. Of course, the accuracy is higher for $i > 3$. In general,
the relative error is $O(N^{-i})$.

This paper has been produced using the Royal Astronomical
Society/Blackwell Science LaTeX style file.